4 research outputs found

    Experiments on predictability of word in context and information rate in natural language

    Get PDF
    Based on data from a large-scale experiment with human subjects, we conclude that the logarithm of probability to guess a word in context (unpredictability) depends linearly on the word length. This result holds both for poetry and prose, even though with prose, the subjects don't know the length of the omitted word. We hypothesize that this effect reflects a tendency of natural language to have an even information rate

    Zipf's Law and Avoidance of Excessive Synonymy

    Full text link
    Zipf's law states that if words of language are ranked in the order of decreasing frequency in texts, the frequency of a word is inversely proportional to its rank. It is very robust as an experimental observation, but to date it escaped satisfactory theoretical explanation. We suggest that Zipf's law may arise from the evolution of word semantics dominated by expansion of meanings and competition of synonyms.Comment: 47 pages; fixed reference list missing in v.
    corecore